AD-A122 078

UNCLASSIFIED

VLSI ARRAY PROCESSOR FOR SIGNAL PROCESSING (U)

UNIVERSITY OF SOUTHERN CALIFORNIA, LOS ANGELES, DEPT OF
ELECTRICAL ENGINEERING. S. KUNG. 16 NOV 82.
N00014-80-C-0457

F/G 17/2

VLSI  ARRAY  PROCESSOR  FOR  SIGNAL  PROCESSING 
UNIVERSITY OF SOUTHERN CALIFORNIA
FINAL REPORT

CONTRACT NO.: N00014 - 80 - C - 0457

Sponsored by

OFFICE OF NAVAL RESEARCH


Covering  Research  Activity  During  the  Period 
1  April  1980  through  31  March  1982 


Principal Investigator

Department of Electrical Engineering - Systems
Los Angeles, California 90089 - 0272


Approved for public release;
Distribution Unlimited



FINAL  REPORT 


CONTRACT  NO.  N00014  -  80  -  C  -  0457 


VLSI  ARRAY  PROCESSOR  FOR  SIGNAL  PROCESSING 


Sponsored  By 

OFFICE  OF  NAVAL  RESEARCH 


Sun-Yuan Kung
Principal  Investigator 
University  of  Southern  California 
Department  of  Electrical  Engineering  -  Systems 
Los  Angeles,  California  90089  -  0272 
(213)  743-7281 


Covering  Research  Activity  During  the  Period 
1  April  1980  through  31  March  1982 


UNCLASSIFIED

SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE

1. REPORT NUMBER:
2. GOVT ACCESSION NO.:
3. RECIPIENT'S CATALOG NUMBER:
4. TITLE (and Subtitle): VLSI Array Processor for Signal Processing
5. TYPE OF REPORT & PERIOD COVERED: Final Report, 1 April 1980 - 31 March 1982
6. PERFORMING ORG. REPORT NUMBER:
7. AUTHOR(s): Sun-Yuan Kung
8. CONTRACT OR GRANT NUMBER(s): N00014 - 80 - C - 0457
9. PERFORMING ORGANIZATION NAME AND ADDRESS: University of Southern California, Department of Electrical Engineering - Systems, Los Angeles, California 90089-0272
10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS:
11. CONTROLLING OFFICE NAME AND ADDRESS: Office of Naval Research, 1030 E. Green Street
12. REPORT DATE: Nov. 16, 1982
13. NUMBER OF PAGES: 23
14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):
15. SECURITY CLASS. (of this report): UNCLASSIFIED
15a. DECLASSIFICATION/DOWNGRADING SCHEDULE:
16. DISTRIBUTION STATEMENT (of this Report): Approved for release; distribution unlimited.
17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report):
18. SUPPLEMENTARY NOTES:
19. KEY WORDS (Continue on reverse side if necessary and identify by block number):
20. ABSTRACT (Continue on reverse side if necessary and identify by block number):

This report describes the research activities performed by the University
of Southern California for the period 1 April 1980 to 31 March 1982 under
Contract No. N00014 - 80 - C - 0457 with the Office of Naval Research. The
research activities have focussed on the VLSI array processor for signal
processing theory and algorithms and the development of parallel computing
architectures.

A solution to today's VLSI research challenge lies in cross-disciplinary
research encompassing the areas of mathematics, algorithms, computers, and
applications.



ABSTRACT 


This report describes the research activities performed by the
University of Southern California for the period 1 April 1980 to 31 March
1982 under Contract No. N00014 - 80 - C - 0457 with the Office of
Naval Research. The research activities have focussed on the VLSI array
processor for signal processing theory and algorithms and the development
of parallel computing architectures.

A solution to today's VLSI research challenge lies in
cross-disciplinary research encompassing the areas of mathematics,
algorithms, computers, and applications. To this end, this report
summarizes two parallel major research tasks: (1) signal processing
algorithm and theory and (2) parallel computing structures.


INTRODUCTION


With the rapidly growing microelectronics technology leading the way,
modern signal processing is undergoing a major revolution. The
availability of low cost, fast VLSI devices promises the practice of
increasingly complex and sophisticated algorithms and systems. However,
this promise brings with it a new challenge: how to update signal
processing techniques so as to effectively utilize the large-scale
computation capability. The answer to this challenge lies in
cross-disciplinary research encompassing the areas of mathematics,
algorithms, computers and applications. To this end, two parallel major
research tasks have been undertaken in the ONR research group:

(1) Signal processing algorithm and theory - emphasizing spectral
analysis and its applications;

(2) Parallel computing structures - utilizing VLSI potential for
high-speed signal processing.

In the area of signal processing theory and algorithms, significant
work has been done on the following topics, with special emphasis on high
resolution spectral estimation: adaptive notch filtering; MEM, ARMA, and ML
spectral estimation in 1-D and 2-D; and Toeplitz approximation. Parallel
implementation of these algorithms has been a major consideration in their
development.

In the complementary area of parallel computing structures, both
dedicated and flexible architectures have been developed for signal
processing tasks and applications. Works in progress include: a Toeplitz
system solver using a pipelined Levinson algorithm and its implementation;
a programmable wavefront array processor and data flow language for VLSI
signal processing algorithms; systolic arrays for real-time signal
processing applications in spectrum analysis and direction finding; and
systolic architectures for ladder forms and parallel Kalman filters.

A brief summary of the technical work, grouped by research area, is
given in the following sections.

Signal Processing Algorithm and Theory

The first research front hinges upon a thorough, in-depth
understanding of mathematics and algorithm analysis. In addition to the
classical mathematical techniques such as the Fourier transform, linear
dynamic systems, random processes, etc., there arises a new branch of
signal processing mathematics which can be loosely termed modern spectral
analysis. Explicitly or not, a large class of signal processing
applications has made extensive use of this analysis as a technical basis.
Therefore, our research effort aims at developing a theoretical and
algorithmic basis for modern spectral analysis methods and signal
processing applications.


Adaptive Notch Filtering

Using a steady state frequency domain approach, a new method has been
developed for the retrieval of sinusoids/narrowband signals in additive
noise, colored or white. The method suggested has been shown to require a
smaller filter length to produce unbiased estimates, compared to the
existing autoregressive method. For its implementation, a pole-zero filter
where the feedback and feedforward coefficients are related (constrained
ARMA) has been developed. A study of the performance and implementational
aspects of the filter has been undertaken. The details of this newly
developed method are discussed in the full report. For a stable
implementation, parallel and cascade forms have been shown to be useful. A
parallel processing scheme developed shows great promise.
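
To make the idea of related feedback and feedforward coefficients concrete, the sketch below builds a second-order pole-zero notch whose poles sit at the same angles as its zeros but at radius r < 1. This particular constraint, the function names, and the test signal are assumptions chosen for illustration; the constrained ARMA filter of the full report is not reproduced here.

```python
import numpy as np

def notch_coefficients(omega0, r=0.95):
    """Second-order constrained pole-zero notch (illustrative assumption).

    Zeros lie on the unit circle at +/- omega0; poles lie at the same
    angles with radius r < 1, so the feedback coefficients are tied to
    the feedforward ones by a[k] = r**k * b[k].
    """
    b = np.array([1.0, -2.0 * np.cos(omega0), 1.0])  # feedforward
    a = b * np.array([1.0, r, r * r])                # feedback (constrained)
    return b, a

def gain(b, a, omega):
    """Magnitude of H(e^{j omega}) = B/A, evaluated directly."""
    z_inv = np.exp(-1j * omega * np.arange(len(b)))
    return abs(np.dot(b, z_inv) / np.dot(a, z_inv))

def apply_filter(b, a, x):
    """Direct-form difference equation; a[0] is assumed to be 1."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y[n] = acc
    return y

omega0 = 0.3 * np.pi
b, a = notch_coefficients(omega0)
n = np.arange(2000)
sinusoid = np.cos(omega0 * n)
residual = apply_filter(b, a, sinusoid)  # notch output: sinusoid removed
estimate = sinusoid - residual           # retrieved narrowband component
```

With r close to 1 the notch is narrow, so frequencies away from omega0 pass almost unchanged while the sinusoid itself is cancelled once the start-up transient has decayed.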

Spectral  Estimation 

Our recent research has been concerned with developing systematic
methods for 2-D spectral estimation from raw data using random field
models. We assume that the given finite data is represented by an
appropriate Gaussian Markov random field (MRF) model.

By using specific finite toroidal lattice representations and Gaussian
maximum likelihood estimates we have developed new 2-D spectral estimates.
It turns out that the MRF spectrum is also the maximum likelihood spectrum
arising in frequency-wave number analysis. Furthermore, the sample
correlation values of the given observations in an array N are in perfect
agreement with the estimated theoretical correlations in N obtained by
Fourier inverting the MRF spectrum. Thus the MRF spectrum developed by us
converges to the 2-D maximum entropy spectral estimate asymptotically.
Currently we have begun investigations on parallel implementation of the
algorithms for 2-D spectral estimation.


It may seem too ambitious to compare all currently popular
high-resolution spectral estimation methods. For example, while the maximum
entropy method related to autoregressive modeling is receiving tremendous
popularity, it may suffer from bias and resolution problems when additive
noise is non-negligible. On the other hand, Pisarenko's method based on
sinusoidal modeling enjoys relatively better performance in the presence of
noise but in general suffers from numerical sensitivity problems. However,
from a different perspective, Pisarenko's method can be viewed as an
extension of the MEM method with the removal of the noise contribution.
Therefore, an attempt is being made at developing a unified framework for
the spectral analysis techniques. Moreover, the unification attempt is
being extended to the counterpart of spectral analysis in array processing
applications. Though the covariance matrix will no longer have a Toeplitz
structure and the phasing vectors are more complex in array processing
situations, we are convinced that the general principles remain largely
applicable. We are currently looking into theoretical and computational
relationships between several modern array processing and spectrum
estimation methods.

Toeplitz Approximation Method

Recently, the study of approximation theory and its applications has
received considerable attention. In our work, a narrowband/sinusoidal
signal retrieval problem is formulated in terms of approximation of a
Toeplitz autocovariance matrix. A Toeplitz approximation method based on
singular value decomposition is proposed, and simulation results indicate
some improvement over some previously proposed methods.


Our recent research at the Image Processing Institute at USC has been
concerned with parallel algorithms for image processing and image analysis.
Most of the effort has been concerned with parallel implementation of
nonstationary adaptive image restoration. Recursive and non-recursive
implementations of locally adaptive restoration have been studied. These
techniques estimate the local nonstationary mean and variance of ideal
scenes from degraded data. Most blurring degradations are also highly
local, so that local parallel processing combined with the nonstationary
image model data can be used to minimize local mean-square error (MSE) in a
parallel fashion. We have shown that local MSE is not a bad error
criterion for image processing, as opposed to the usual global MSE taken
over the entire scene. Global MSE often does not correlate well with human
observer judgments of image quality.
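
The role of local mean and variance estimates can be illustrated with a minimal sketch. It assumes additive white noise of known variance and no blur, and it uses a simple local linear minimum mean-square error rule; the window size and function names are hypothetical, and the recursive and non-recursive algorithms studied in this work are not reproduced here.

```python
import numpy as np

def local_stats(img, half=2):
    """Local mean and variance over a (2*half+1)^2 window (edge-padded)."""
    padded = np.pad(img, half, mode="edge")
    size = 2 * half + 1
    rows, cols = img.shape
    mean = np.zeros_like(img, dtype=float)
    sq = np.zeros_like(img, dtype=float)
    for dr in range(size):
        for dc in range(size):
            win = padded[dr:dr + rows, dc:dc + cols]
            mean += win
            sq += win * win
    mean /= size * size
    var = sq / (size * size) - mean ** 2
    return mean, np.maximum(var, 0.0)

def local_lmmse(noisy, noise_var, half=2):
    """Pixel-wise estimate: smooth flat areas, keep high-variance detail."""
    mean, var = local_stats(noisy, half)
    signal_var = np.maximum(var - noise_var, 0.0)  # estimated scene variance
    gain = signal_var / (signal_var + noise_var)
    return mean + gain * (noisy - mean)

rng = np.random.default_rng(0)
ideal = np.zeros((32, 32))
ideal[:, 16:] = 10.0                               # a single step edge
noisy = ideal + rng.normal(0.0, 1.0, ideal.shape)
restored = local_lmmse(noisy, noise_var=1.0)
```

Every output pixel depends only on a small neighborhood of the input, which is the property that makes this kind of estimator a natural candidate for local parallel processing.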

We have looked at the application of these techniques to systems with
coherent speckle noise, such as synthetic aperture radar (SAR) imagery,
coherent sonar and acoustic imaging. Both recursive (Kalman-like) and local
sectioned parallel implementations are being studied in detail.

In addition, we have begun investigations on parallel feature
extraction for texture identification and texture segmentation.

A Parallel Algorithm for Solving Toeplitz Systems


We have developed a parallel algorithm for solving a Toeplitz system
$Tx = y$, where $T$ is a Toeplitz matrix, i.e.,
$[T]_{ij} = t_{i-j} = t_k$, $-N < k < N$. In general, solving an N by
N linear system takes O(N**3) operations. In contrast, the
Levinson algorithm effectively utilizes the Toeplitz structure to reduce
the overall computation to O(N**2) operations. The Levinson procedure,
however, has to call upon an inner product operation to compute the vital
reflection coefficients. In order to achieve full parallelism, we have to
further exploit the Toeplitz structure. For this purpose, we have proposed
a new, pipelined version of the Levinson algorithm which allows the
reflection coefficients to be computed in a pipelined fashion. This avoids
the need for the inner product operations, and the total computing time is
therefore reduced to O(N).
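
For reference, the conventional sequential Levinson recursion can be sketched as follows. The sketch assumes a symmetric, positive definite Toeplitz matrix given by its first column; the function name and test data are illustrative, and the pipelined version proposed in this work is not reproduced. The two `np.dot` calls are the inner products that each stage requires.

```python
import numpy as np

def levinson_solve(t, y):
    """Solve T x = y for symmetric Toeplitz T with first column t.

    Sequential O(N**2) recursion; returns the solution and the
    reflection coefficients computed along the way.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    a = np.array([1.0])          # prediction-error filter, order 0
    err = t[0]                   # prediction-error power
    x = np.array([y[0] / t[0]])  # solution of the 1 x 1 system
    reflection = []
    for k in range(1, n):
        # Inner product needed for the next reflection coefficient.
        delta = np.dot(a, t[k:0:-1])
        kappa = -delta / err
        reflection.append(kappa)
        a_ext = np.append(a, 0.0)
        a = a_ext + kappa * a_ext[::-1]
        err *= 1.0 - kappa * kappa
        # Extend the solution from order k to order k + 1.
        eps = np.dot(x, t[k:0:-1])
        x = np.append(x, 0.0) + ((y[k] - eps) / err) * a[::-1]
    return x, np.array(reflection)

t = np.array([4.0, 2.0, 1.0, 0.5, 0.25])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x, reflection = levinson_solve(t, y)
```

Each of the N stages costs O(N) work on a single processor, which gives the O(N**2) total quoted above.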


Toeplitz Eigenvalue


This research task deals with the parallel computation of the minimum
eigenvalue of a Toeplitz matrix. The minimum eigenvalue has an important
interpretation as the power of additive white noise to be determined in a
noisy statistical environment. In many high resolution spectrum analysis
problems, the estimation and removal of such noise contribution is
essential for unbiased estimates. Our objective is again to derive
an O(N) computation algorithm to estimate the minimum eigenvalue of a
given Toeplitz covariance matrix. This goal can be accomplished by adopting
the pipelined Toeplitz computing structure discussed earlier and a careful
utilization of a relationship between the minimum eigenvalue and the
residues E that arise in the Levinson algorithm. Based on this
relationship, a fast iterative procedure is developed to successively
estimate the minimum eigenvalue. Based on simulation results for such an
application, some improvements are observed in both the computing speed
and the accuracy of estimates. Although much more computational complexity
analysis is yet to be carried out, we are convinced that this approach
will have a major impact on future applications of high speed, high
resolution spectrum estimation problems.
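
The interpretation of the minimum eigenvalue as white noise power can be checked numerically. The sketch below assumes real sinusoids with random phase in white noise and a matrix order larger than twice the number of sinusoids; it uses a dense eigenvalue routine as a reference, not the fast iterative procedure described above, and all names and values are illustrative.

```python
import numpy as np

def sinusoid_covariance(order, freqs, powers, noise_power):
    """Toeplitz covariance of real sinusoids plus white noise."""
    lags = np.arange(order)
    r = sum(p * np.cos(w * lags) for w, p in zip(freqs, powers))
    r[0] += noise_power                    # white noise adds only at lag 0
    idx = np.abs(np.subtract.outer(lags, lags))
    return r[idx]

R = sinusoid_covariance(order=6, freqs=[0.4, 1.1],
                        powers=[2.0, 1.0], noise_power=0.25)
min_eig = np.linalg.eigvalsh(R)[0]         # eigenvalues come back ascending
```

Two real sinusoids contribute a rank-4 signal part, so with order 6 the smallest eigenvalue of R equals the noise power; subtracting it from the diagonal removes the noise contribution.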

Application of SVD to Signal Processing

It is well known that SVD can be used in many signal processing
applications. Therefore its parallel (real-time) implementation has been an
important research focus. Some partial results are offered in the report.
The most noteworthy result is the significant numerical improvement
of 60 dB in terms of dynamic range obtained in the computation of the
eigenvalues of $R = A^{T} A$ via SVD of A. This approach is being extended
to generalized eigensystem computation.
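
The numerical advantage of working on A rather than on the product can be seen in a small experiment. The sketch below is an illustration under assumed test data (a matrix built with known singular values); it does not reproduce the 60 dB figure reported above.

```python
import numpy as np

rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.normal(size=(8, 3)))   # orthonormal columns
V, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # orthogonal matrix
true_sigma = np.array([1.0, 1e-4, 1e-8])       # assumed singular values
A = U @ np.diag(true_sigma) @ V.T

# Route 1: form the product explicitly, then take its eigenvalues.
eig_direct = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]

# Route 2: square the singular values of A.
eig_via_svd = np.linalg.svd(A, compute_uv=False) ** 2

true_eig = true_sigma ** 2
rel_err_svd = np.abs(eig_via_svd - true_eig) / true_eig
rel_err_direct = np.abs(eig_direct - true_eig) / true_eig
```

Forming the product squares the condition number, so the smallest eigenvalue (1e-16 here) falls at the level of double-precision rounding in route 1, while route 2 works on values that are still well above that level.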

Parallel Algorithms for Seismic Signal Processing

Parallel processing techniques for generating synthetic seismograms
and for the computation of the output of a horizontally stratified,
non-absorptive medium propagating plane waves vertically have been
studied.

Highly Parallel Computing Structures


The aforementioned research effort on signal processing algorithm and
theory, equipped with parallel algorithms and adaptive on-line processing
techniques, will serve as a useful cornerstone for the real-time high
performance signal processing area. However, the real major thrust for
high-speed signal processing lies in effective utilization of the enormous
computation capability provided by VLSI circuits. Therefore, our
research task aims to bring the revolutionary VLSI device technology to an
effective signal processing application.

Pipelined Toeplitz System Solver [ ]

This new parallel algorithm for solving Toeplitz systems can be
implemented for parallel computation in full compliance with the VLSI
communication constraint. Specifically, a pipelined processor architecture
with O(N) processors is developed which uses only localized
interconnections and still retains the maximum parallelism attainable.


We believe that the proposed pipelined Toeplitz system solver [ ] is
perhaps the most efficient, fast, and practical (in the VLSI sense) design
available for solving Toeplitz systems. Moreover, the design methodology
demonstrated in this work should also help answer some fundamental problems
faced in the design of VLSI parallel processor architectures.

Wavefront  Array  Processor 

The traditional design of parallel computers and languages is not very
suitable for the design of VLSI array processors for signal processing.
VLSI imposes the restrictions of local data-dependence and recursivity on
the algorithms that can be handled by such an array processor. Such
algorithms can be viewed as a sequence of waves (of data and computational
activity). This naturally leads to a wavefront based programmable
computing network, which we call the Wavefront Array Processor (WAP).


Our contribution hinges upon the development of a wavefront-based
language and architecture for a programmable special purpose multiprocessor
array. Based on the notion of computational wavefront, the hardware of the
processor array is designed to provide a computing medium that preserves
the key properties of the wavefront. In conjunction, a wavefront language
(MDFL) is introduced that drastically reduces the complexity of the
description of parallel algorithms and simulates the wavefront propagation
across the computing network. Together, the hardware and the language lead
to a programmable Wavefront Array Processor (WAP). The WAP blends the
advantages of the dedicated systolic array and the general purpose
Data-Flow machine and provides a powerful tool for the high speed execution
of a large class of matrix operations and related algorithms which have
widespread applications.


With the rapidly growing microelectronics technology leading the way,
modern signal processor architectures are undergoing a major revolution.
The availability of low cost, fast VLSI (Very Large Scale Integration)
devices promises the practice of cost-effective, high speed, parallel
processing of large volumes of data. This makes possible ultra high
throughput rates and, therefore, signifies a major technological
breakthrough for real-time signal processing applications. On the other
hand, it has become more critical than ever to gain a fundamental
understanding of the algorithm structure, architecture, and
implementation constraints in order to realize the full potential of
VLSI computing power. In our work, the two most critical issues,
parallel computing algorithm and VLSI architectural constraint, will be
considered:

1. To structure the algorithm to achieve the maximum parallelism
and, therefore, the maximum throughput rate.

2. To cope with the communication constraint so as to compromise
least in processing throughput rate.

1.1 A highly concurrent Toeplitz system solver [5-6]

Based on the above considerations, we have developed a highly
concurrent Toeplitz system solver, featuring maximum parallelism and
localized communication.

Toeplitz systems arise in numerous, widespread applications ranging
from speech, image, and neurophysics to radar, sonar, geophysics, and
astronomical signal processing. Our contribution lies in the
development of a highly concurrent algorithm and pipelined architecture
which is able to solve a Toeplitz system in O(N) processing time in an
array processor, as opposed to O(N**3) for the general (sequential) Gauss
elimination procedure or O(N**2) for the (sequential) Levinson algorithm.

For parallel consideration, we note that the Levinson procedure has
to call upon an inner product operation to compute the vital "reflection
coefficients". Even when N processors are utilized, an inner product
operation will need at least log N units of time. This will amount to a
total of O(N log N) units of computing time for the entire Levinson
procedure. This is of course unsatisfactory since the processors are not
effectively utilized.
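
The log N bound for a parallel inner product can be made concrete by counting the rounds of a pairwise (tree) summation. The sketch below only simulates the schedule on one machine; the function name and the convention of counting the multiply as one step are assumptions for illustration.

```python
import math

def parallel_inner_product(u, v):
    """Simulate an N-processor inner product; return (value, time_steps).

    All products are formed in one parallel step; the partial sums are
    then combined pairwise, one round of additions per time step.
    """
    partial = [a * b for a, b in zip(u, v)]
    steps = 1
    while len(partial) > 1:
        # Additions within one round are independent of each other.
        nxt = [partial[i] + partial[i + 1]
               for i in range(0, len(partial) - 1, 2)]
        if len(partial) % 2:
            nxt.append(partial[-1])
        partial = nxt
        steps += 1
    return partial[0], steps

value, steps = parallel_inner_product(range(1, 9), range(1, 9))
expected_steps = 1 + math.ceil(math.log2(8))
```

With N such inner products executed one after another, as in the N stages of the Levinson procedure, the total time grows as N log N even though N processors are available.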

In order to achieve full parallelism, we have to further exploit the
Toeplitz structure. For this purpose, we have proposed a new, pipelined
version of the Levinson algorithm which allows the reflection
coefficients to be computed in a pipelined fashion. This avoids the need
for the inner product operations, and the total computing time is
therefore reduced to O(N).

This new algorithm can be implemented in full compliance with the
VLSI communication constraint. More precisely, a pipelined processor
architecture is developed which uses only localized interconnections and
still retains the maximum parallelism attainable.

In summary, we believe that the proposed pipelined Toeplitz system
solver is perhaps the most efficient, fast, and practical (in the VLSI
sense) design available for solving Toeplitz systems. Moreover, the
design methodology demonstrated in this work should also help answer
some fundamental problems faced in the design of VLSI parallel processor
architectures.


This research task deals with the parallel computation of the minimum
eigenvalue of a Toeplitz matrix. The minimum eigenvalue has an
important interpretation as the power of additive white noise to be
determined in a noisy statistical environment. In many high resolution
spectrum analysis problems, the estimation and removal of such noise
contribution is essential for unbiased estimates. Our objective is
again to derive an O(N) computation algorithm to estimate the minimum
eigenvalue of a given Toeplitz covariance matrix. This goal can be
accomplished by adopting the pipelined Toeplitz computing structure
discussed earlier and a careful utilization of a relationship between
the minimum eigenvalue and the residues E that arise in
the Levinson algorithm. Based on this relationship, a fast iterative
procedure is developed to successively estimate the minimum eigenvalue.
Based on simulation results for such an application, some improvements
are observed in both the computing speed and the accuracy of
estimates. Although much more computational complexity analysis is yet
to be carried out, we are convinced that this approach will have a
major impact on future applications of high speed, high resolution
spectrum estimation problems.

1.3 Wavefront Array Processor

The traditional design of parallel computers and languages usually
suffers from heavy supervisory overhead incurred by synchronization,
communication, and scheduling tasks, which severely hamper the
throughput rate that is critical to real-time signal processing.

Furthermore, additional restrictions imposed by VLSI will render the
general purpose array processor very inefficient. We therefore restrict
ourselves to a special class of applications, i.e. recursive and local
data dependent algorithms, to conform with the constraints imposed by
VLSI. However, this restriction incurs little loss of generality, as a
great majority of signal processing algorithms possess these properties.
One typical example is a class of matrix algorithms.

Very significantly, these algorithms involve repeated application of
relatively simple operations with regular localized data flow in a
homogeneous computing network. This leads to an important notion of
computational wavefront, which portrays the computation activities in a
manner resembling a wave propagation phenomenon. More precisely, the
recursive nature of the algorithm, in conjunction with the localized
data dependency, points to a continuously advancing wave of data and
computational activity.
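
A computational wavefront can be visualized with a small simulation of matrix multiplication on a square array of processing elements, where PE (i, j) accumulates one entry of the product and the k-th wavefront carries the k-th column of A and the k-th row of B. The timing rule used below (wavefront k reaches PE (i, j) at step i + j + k) and all names are assumptions made for illustration; the sketch runs sequentially and is not the WAP hardware or its language.

```python
import numpy as np

def wavefront_matmul(A, B):
    """Compute C = A @ B as n successive wavefronts over an n x n PE array.

    Returns C and firing_time[k, i, j], the assumed step at which
    wavefront k activates PE (i, j).
    """
    n = A.shape[0]
    C = np.zeros((n, n))
    firing_time = np.zeros((n, n, n), dtype=int)
    for k in range(n):                       # k-th wavefront
        for front in range(2 * n - 1):       # anti-diagonal i + j = front
            for i in range(max(0, front - n + 1), min(n, front + 1)):
                j = front - i
                # Local operation: one multiply-accumulate per PE.
                C[i, j] += A[i, k] * B[k, j]
                firing_time[k, i, j] = k + front
    return C, firing_time

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4))
C, firing_time = wavefront_matmul(A, B)
```

Each PE sees the wavefronts in order and one step apart, so successive fronts sweep across the array without overtaking one another; under the assumed timing the last PE fires at step 3n - 3.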

The wavefront concept provides a firm theoretical foundation for the
design of highly parallel array processors and concurrent languages.
Moreover, this concept appears to have some distinct advantages.


Firstly, the wavefront notion drastically reduces the complexity in
the description of parallel algorithms. The mechanism provided for this
description is a special purpose, wavefront-oriented language. Rather
than requiring a program for each processor in the array, this language
allows the programmer to address an entire front of processors.

Secondly, the wavefront notion leads to a wavefront-based
architecture that conforms with the constraints of VLSI, and supports a
major class of signal processing algorithms. As a consequence of
Huygens' principle, wavefronts should never intersect. With a wavefront
architecture that provides asynchronous waiting capability, this
principle is preserved. Therefore, the wavefront approach can cope with
timing uncertainties, such as local clocking, random delay in
communications and fluctuations of computing times. In short, there is
no need for global synchronization.

Thirdly, the wavefront notion is applicable to all VLSI signal
processing algorithms that possess locality and recursivity, and hence
has numerous applications.

The integration of the wavefront concept, the wavefront language and
the wavefront architecture leads to a programmable computing network,
which we will call the WAVEFRONT ARRAY PROCESSOR (WAP). The WAP is, in a
sense, an optimal tradeoff between the globally synchronized and
dedicated systolic array (that works on a similar set of algorithms),
and the general-purpose data-flow multiprocessors. It provides a
powerful tool for the high speed execution of a large class of
algorithms which have widespread applications. The applications are
very broad, including PDE solver, SVD, linear systems solvers, sorting
and searching routines.


There exist two approaches to programming the WAP: a
local approach, describing the actions of each processing element, and a
global approach, describing the actions of each wavefront. To allow the
user to program the WAP in both these fashions, two versions of MDFL are
proposed: global and local MDFL. A global MDFL program describes the
algorithm from the viewpoint of a wavefront, while a local MDFL program
describes the operations of an individual processor. More precisely,
the perspective of a global MDFL programmer is of one wavefront passing
across all the processors, while the perspective of a local MDFL
programmer is that of one processor encountering a series of wavefronts.

In summary, our contribution hinges upon the development of a
wavefront-based language and architecture for a programmable special
purpose multiprocessor array. Based on the notion of computational
wavefront, the hardware of the processor array is designed to provide a
computing medium that preserves the key properties of the wavefront. In
conjunction, a wavefront language (MDFL) is introduced that drastically
reduces the complexity of the description of parallel algorithms and
simulates the wavefront propagation across the computing network.
Together, the hardware and the language lead to a programmable Wavefront
Array Processor (WAP). The WAP blends the advantages of the dedicated
systolic array and the general purpose Data-Flow machine and provides a
powerful tool for the high speed execution of a large class of matrix
operations and related algorithms which have widespread applications.


This research front hinges upon a thorough, in-depth
understanding of mathematics and algorithm analysis. In addition to the
classical mathematical techniques such as the Fourier transform, linear
dynamic systems, random processes, etc., there arises a new branch of
signal processing mathematics which can be loosely termed modern
spectral analysis. Explicitly or not, a large class of signal processing
applications has made extensive use of this analysis as a technical
basis. Therefore, our research effort aims at developing a theoretical
and algorithmic basis for modern spectral analysis methods and signal
processing applications.


2.1 Adaptive Notch Filtering (USC [1-3])

Using a steady state frequency domain approach, a new method has been
developed for the retrieval of sinusoids/narrowband signals in additive
noise, colored or white. The method suggested has been shown to require a
smaller filter length to produce unbiased estimates, compared to the
existing autoregressive method. For its implementation, a pole-zero
filter where the feedback and feedforward coefficients are related
(constrained ARMA) has been developed. A study of the performance and
implementational aspects of the filter has been undertaken. The
details of this newly developed method are discussed in the full report.
For a stable implementation, parallel and cascade forms have been shown
to be useful. A parallel processing scheme developed shows great promise.

2.2 Relationships Between Several Popular Methods for Spectral
Estimation and Array Processing

It may seem too ambitious to compare all currently popular high-
resolution spectral estimation methods. However, from a different
perspective, Pisarenko's method can be viewed as an extension of the
maximum entropy method (MEM) with the removal of the noise contribution.
Therefore, an attempt is being made at developing a unified framework
for the spectral analysis techniques. Moreover, the unification attempt
is being extended to the counterpart of spectral analysis in array
processing applications, for which we are convinced that the general
principles remain largely applicable. We are currently looking into the
theoretical and computational relationships between several recent array
processing and spectrum estimation methods.
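
To make the "removal of the noise contribution" concrete, the following
is a minimal sketch of the classical Pisarenko harmonic decomposition for
a single real sinusoid in white noise. It is not the unified framework of
this report, only the textbook method it refers to. The function name
pisarenko_single_sinusoid is hypothetical, and exact autocovariance lags
r(0), r(1), r(2) are assumed to be available.

```python
import numpy as np

def pisarenko_single_sinusoid(r):
    """Estimate the frequency and noise variance from lags r(0), r(1), r(2).

    Assumes r(k) = P*cos(omega*k) + sigma2*delta(k), i.e. one real
    sinusoid in white noise. The smallest eigenvalue of the 3x3 Toeplitz
    autocovariance matrix equals sigma2, and the roots of the polynomial
    formed from its eigenvector lie at exp(+/- j*omega).
    """
    r = np.asarray(r, dtype=float)
    idx = np.abs(np.arange(3)[:, None] - np.arange(3)[None, :])
    R = r[idx]                           # symmetric Toeplitz matrix
    vals, vecs = np.linalg.eigh(R)       # eigenvalues in ascending order
    noise_var = float(vals[0])           # noise contribution to be removed
    roots = np.roots(vecs[:, 0])         # roots on the unit circle
    omega = float(np.abs(np.angle(roots[0])))
    return omega, noise_var
```

Subtracting the estimated noise variance from r(0) leaves the noise-free
autocovariance of the sinusoid, which is the sense in which the noise
contribution is removed.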

2.3 Toeplitz Approximation Method (USC [4])

Recently, the study of approximation theory and its applications has
received considerable attention. In our work, a narrowband/sinusoidal
signal retrieval problem is formulated in terms of the approximation of
a Toeplitz autocovariance matrix. A Toeplitz approximation method based
on singular value decomposition is proposed, and simulation results
indicate some improvement over some previously proposed methods.
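
The summary does not spell out the procedure, so the following is only a
sketch under stated assumptions: it alternates between a rank-reducing
SVD truncation and a projection back onto symmetric Toeplitz matrices
(averaging along diagonals). The function names, the rank parameter and
the iteration count are hypothetical and are not taken from the report.

```python
import numpy as np

def toeplitz_from_lags(c):
    """Build a symmetric Toeplitz matrix from its first column."""
    c = np.asarray(c, dtype=float)
    n = len(c)
    idx = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    return c[idx]

def project_toeplitz(m):
    """Nearest symmetric Toeplitz matrix in the Frobenius norm:
    average the entries along each pair of +/- k diagonals."""
    n = m.shape[0]
    lags = [np.mean(np.concatenate([np.diag(m, k), np.diag(m, -k)]))
            for k in range(n)]
    return toeplitz_from_lags(lags)

def project_low_rank(m, rank):
    """Best rank-`rank` approximation via truncated SVD."""
    u, s, vt = np.linalg.svd(m)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

def toeplitz_approximation(r, rank, iterations=50):
    """Alternate the two projections, ending on the Toeplitz step so
    that the result is exactly Toeplitz."""
    m = np.array(r, dtype=float)
    for _ in range(iterations):
        m = project_toeplitz(project_low_rank(m, rank))
    return m
```

For p real sinusoids the noise-free autocovariance matrix has rank 2p, so
rank=2 would correspond to a single sinusoid in this sketch.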

3 SUMMARY OF RESEARCH ACTIVITIES IN IPI, USC (A. A. Sawchuk, R. Chellappa)

3.1 Parallel Algorithms for Image Processing and Analysis

Our recent research at the Image Processing Institute, USC, has been
concerned with parallel algorithms for image processing and image
analysis. Most of the effort has been concerned with parallel
implementation of nonstationary adaptive image restoration. Recursive
and non-recursive implementations of locally adaptive restoration have
been studied. These techniques estimate the local nonstationary mean
and variance of ideal scenes from degraded data. Most blurring
degradations are also highly local, so that local parallel processing
combined with the nonstationary image model data can be used to minimize
local mean-square error (MSE) in a parallel fashion. We have shown that
local MSE is not a bad error criterion for image processing, as opposed
to the usual global MSE taken over the entire scene. Global MSE often
does not correlate well with human observer judgments of image quality.
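
As an illustration of estimating local mean and variance and minimizing
local MSE pixel by pixel, here is a minimal sketch of a local-statistics
filter for additive noise with no blur. It is a standard local linear
minimum-MSE estimator, used as an assumption; the report's own algorithms
(which also handle blur and recursive forms) are not reproduced here. The
names local_mean, local_lmmse, noise_var and radius are hypothetical.

```python
import numpy as np

def local_mean(img, radius):
    """Mean over a (2*radius+1)^2 window, with reflected borders."""
    img = np.asarray(img, dtype=float)
    padded = np.pad(img, radius, mode="reflect")
    size = 2 * radius + 1
    out = np.zeros_like(img)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / size ** 2

def local_lmmse(img, noise_var, radius=2):
    """Local minimum-MSE estimate assuming additive noise of known
    variance noise_var > 0 and no blur."""
    g = np.asarray(img, dtype=float)
    mean = local_mean(g, radius)
    var_obs = local_mean(g * g, radius) - mean ** 2
    # Variance of the ideal scene: observed variance minus noise variance.
    var_ideal = np.maximum(var_obs - noise_var, 0.0)
    gain = var_ideal / (var_ideal + noise_var)
    # Flat regions (gain near 0) are smoothed; busy regions are preserved.
    return mean + gain * (g - mean)
```

Every output pixel depends only on a small neighborhood of the input, so
each pixel (or each image section) could be assigned to its own
processing element.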

We have looked at the application of these techniques to systems with
coherent speckle noise, such as synthetic aperture radar (SAR) imagery,
coherent sonar and acoustic imaging. Both recursive (Kalman-like) and
local sectioned parallel implementations are being studied in detail.
In addition, we have begun investigations on parallel feature extraction
for texture identification and texture segmentation.


3.2 Two-Dimensional Spectral Estimation

Two-dimensional spectral estimation is of interest in image
restoration, filtering of SAR images and texture classification. Our
recent research has been concerned with developing systematic methods
for 2-D spectral estimation from raw data using random field models. We
assume that the given finite data is represented by an appropriate
Gaussian Markov random field (MRF) model.


This assumption reduces the spectral estimation problem to that of
estimating the appropriate structure and the parameters of the model.
By using specific finite toroidal lattice representations and Gaussian
maximum likelihood estimates we have developed new 2-D spectral
estimates. It turns out that the MRF spectrum is also the maximum
likelihood spectrum arising in frequency-wavenumber analysis.
Furthermore, the sample correlation values of the given observations in
an array N are in perfect agreement with the estimated theoretical
correlations in N obtained by Fourier inverting the MRF spectrum.
Thus the MRF spectrum developed by us converges to the 2-D maximum
entropy spectral estimate asymptotically. Currently we have begun
investigations on parallel implementation of the algorithms for 2-D
spectral estimation.
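
For orientation, the sketch below evaluates a Gaussian MRF spectrum on
the frequency grid of a toroidal lattice and Fourier inverts it to obtain
the model correlations. The summary does not state the spectral formula,
so the form S = nu / (1 - 2 * sum_r theta_r cos(w . r)), taken over half
of a symmetric neighbor set, is an assumption based on the usual Gaussian
MRF parameterization. The names gmrf_spectrum, correlations_from_spectrum,
theta and nu are hypothetical, and parameter estimation is not shown.

```python
import numpy as np

def gmrf_spectrum(theta, nu, size):
    """Evaluate the assumed GMRF spectrum on a size x size toroidal grid.

    theta maps a neighbor offset (dy, dx) to its coefficient, with one
    entry per symmetric pair of neighbors; nu is the driving variance.
    """
    w = 2.0 * np.pi * np.arange(size) / size
    w1, w2 = np.meshgrid(w, w, indexing="ij")
    denom = np.ones((size, size))
    for (dy, dx), coeff in theta.items():
        denom -= 2.0 * coeff * np.cos(w1 * dy + w2 * dx)
    if np.any(denom <= 0.0):
        raise ValueError("parameters do not give a valid (positive) spectrum")
    return nu / denom

def correlations_from_spectrum(spectrum):
    """Fourier invert the sampled spectrum to get toroidal correlations."""
    return np.fft.ifft2(spectrum).real
```

A first-order model in this sketch would use, for example,
theta = {(0, 1): t_h, (1, 0): t_v} with |t_h| + |t_v| < 0.5 so that the
denominator stays positive.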

In addition, we are also investigating the use of another class of
random field models known as spatial autoregressive models, which are
white-noise-driven noncausal models, for spectral estimation.

4 PARALLEL PROCESSING TECHNIQUES FOR SEISMIC
PROCESSING (J. Mendel, J. Goutsias)

Because of the large volume of information involved in the simulation
and processing of seismic data, and the amount of processing required,
parallel techniques have begun to be studied. The recent development of
VLSI systems and the growing sophistication in the design of array
processors can lead to the efficient simulation of large seismic models.
We are examining some possible parallel processing techniques for the
computation of the output of a horizontally stratified, nonabsorptive
medium in which there are vertically travelling plane compressional
waves.

This  task  has  just  been  started  and  we  intend  to  look  at  different 
parallel  structures  for  generating  synthetic  seismograms. 
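
One possible parallel structure can be sketched as follows. The model is
an assumption, not the report's: layers of equal one-way travel time (one
time step each), no free surface, no absorption, and a downgoing wave of
unit amplitude that reflects with coefficient r and transmits with 1 + r
at each interface (an upgoing wave reflects with -r and transmits with
1 - r). The function name synthetic_seismogram and its arguments are
hypothetical.

```python
import numpy as np

def synthetic_seismogram(refl, source):
    """Upgoing wave recorded just above the top interface.

    refl[k] is the reflection coefficient of interface k for a wave
    arriving from above; source[t] is the downgoing wave that reaches
    the top interface at time step t.
    """
    r = np.asarray(refl, dtype=float)
    down = np.zeros(len(r))  # downgoing wave arriving at each interface
    up = np.zeros(len(r))    # upgoing wave arriving at each interface
    out = np.zeros(len(source))
    for t, s in enumerate(source):
        down[0] = s
        # Scattering: all interfaces are updated independently, so this
        # step maps onto one processing element per interface.
        down_out = (1.0 + r) * down - r * up
        up_out = r * down + (1.0 - r) * up
        out[t] = up_out[0]
        # Propagation: nearest-neighbor shifts by one layer per time step.
        down = np.concatenate(([0.0], down_out[:-1]))
        up = np.concatenate((up_out[1:], [0.0]))
    return out
```

Because each time step consists of an interface-local update followed by
a shift to the neighboring interface, the computation has the regular,
locally connected form suited to an array processor. Multiples are
included automatically, since every interface scatters both wave
directions.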